fix: migrate Langfuse integration from start_generation to start_obse… #14205
wangq8 merged 3 commits into infiniflow:main
…rvation

The Langfuse Python SDK v3+ removed the `start_generation()` method. RagFlow's code called this non-existent method, causing an AttributeError when Langfuse tracing is enabled.

Replace all `start_generation()` calls with `start_observation(as_type="generation")`, which is the correct v4 SDK API.

Affected files:
- api/db/services/llm_service.py (12 occurrences)
- api/db/services/dialog_service.py (1 occurrence)

Fixes infiniflow#14204
Related to infiniflow#9243

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
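The call-site change described above can be sketched with a small wrapper. This is only an illustration: the PR replaces the calls outright rather than adding a shim, and `start_generation_compat`, `FakeNewTracer`, and `FakeOldTracer` are hypothetical names invented here so the snippet runs without the Langfuse SDK installed. The only call taken from the PR description is `start_observation(as_type="generation", ...)`.

```python
from typing import Any


def start_generation_compat(tracer: Any, **kwargs: Any) -> Any:
    """Start a generation observation on either client shape.

    Hypothetical helper for illustration; not part of RagFlow or Langfuse.
    """
    if hasattr(tracer, "start_observation"):
        # Newer clients: the call named in the PR description.
        return tracer.start_observation(as_type="generation", **kwargs)
    # Older clients that still expose start_generation().
    return tracer.start_generation(**kwargs)


class FakeNewTracer:
    """Stand-in for a client that only has start_observation()."""

    def start_observation(self, as_type: str, **kwargs: Any) -> dict:
        return {"as_type": as_type, **kwargs}


class FakeOldTracer:
    """Stand-in for a client that only has start_generation()."""

    def start_generation(self, **kwargs: Any) -> dict:
        return {"as_type": "generation", **kwargs}


new_obs = start_generation_compat(FakeNewTracer(), name="chat", model="demo-model")
old_obs = start_generation_compat(FakeOldTracer(), name="chat", model="demo-model")
```

Probing with `hasattr` keeps the sketch independent of any particular SDK version string; a project pinned to a single major version would not need the fallback branch.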
📝 Walkthrough
Replaced Langfuse callsites that used …
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/db/services/dialog_service.py`:
- Around line 513-514: The trace_id is generated with str(uuid.uuid4()) which
yields a hyphenated 36-char UUID and is incompatible with Langfuse; replace that
generation so trace_context["trace_id"] is a 32-character lowercase hex string
(use langfuse.create_trace_id() if available or uuid.uuid4().hex) where trace_id
is set and trace_context is built (references: trace_id, trace_context in this
module).
- Around line 779-783: The cleanup guard should not rely on
"langfuse_generation" appearing in locals(); initialize langfuse_generation =
None before the if langfuse_tracer block (where you call
langfuse_tracer.start_observation) and then in the finalization/cleanup use a
direct test like "if langfuse_generation is not None:" (or truthiness) to decide
whether to call finish/close on langfuse_generation; this ensures references to
langfuse_generation (created by start_observation) are safe in the nested
cleanup code.
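The first finding turns on the difference between the two UUID string forms. A minimal sketch, assuming only the standard library: `new_trace_id` and `is_valid_trace_id` are hypothetical helper names, and the 32-character lowercase hex requirement is taken from the review comment rather than checked against the SDK here.

```python
import re
import uuid

# 32 lowercase hexadecimal characters, as required by the review comment.
TRACE_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def new_trace_id() -> str:
    """Return a UUID4 rendered without hyphens (32 lowercase hex characters)."""
    return uuid.uuid4().hex


def is_valid_trace_id(value: str) -> bool:
    """Check the shape of a trace id against the 32-hex-character pattern."""
    return TRACE_ID_PATTERN.fullmatch(value) is not None


# str(uuid.uuid4()) keeps the hyphens and is 36 characters long.
hyphenated = str(uuid.uuid4())

# .hex drops the hyphens, giving the shape the comment asks for.
trace_id = new_trace_id()
trace_context = {"trace_id": trace_id}
```

The comment also names `langfuse.create_trace_id()` as an option when it is available; the standard-library form is used here so the snippet has no third-party dependency.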
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ad5f3804-5f7f-4f9c-9a43-893334ccadcb
📒 Files selected for processing (2)
- api/db/services/dialog_service.py
- api/db/services/llm_service.py
✅ Files skipped from review due to trivial changes (1)
- api/db/services/llm_service.py
  if langfuse_tracer:
-     langfuse_generation = langfuse_tracer.start_generation(
+     langfuse_generation = langfuse_tracer.start_observation(as_type="generation",
          trace_context=trace_context, name="chat", model=llm_model_config["llm_name"],
          input={"prompt": prompt, "prompt4citation": prompt4citation, "messages": msg}
      )
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether `decorate_answer` directly closes over `langfuse_generation`.
python - <<'PY'
import ast
from pathlib import Path

path = Path("api/db/services/dialog_service.py")
tree = ast.parse(path.read_text())
async_chat = next(
    node for node in tree.body
    if isinstance(node, ast.AsyncFunctionDef) and node.name == "async_chat"
)
decorate_answer = next(
    node for node in async_chat.body
    if isinstance(node, ast.FunctionDef) and node.name == "decorate_answer"
)
names = {n.id for n in ast.walk(decorate_answer) if isinstance(n, ast.Name)}
string_check = any(
    isinstance(n, ast.Constant) and n.value == "langfuse_generation"
    for n in ast.walk(decorate_answer)
)
print("direct_name_reference =", "langfuse_generation" in names)
print("string_check_present =", string_check)
PY

Repository: infiniflow/ragflow
Length of output: 118

🏁 Script executed:

#!/bin/bash
# First, verify the file exists and check the relevant lines
head -n 800 api/db/services/dialog_service.py | tail -n 50

Repository: infiniflow/ragflow
Length of output: 2836

🏁 Script executed:

#!/bin/bash
# Search for the cleanup path that checks langfuse_generation
rg -n "langfuse_generation" api/db/services/dialog_service.py -A 2 -B 2

Repository: infiniflow/ragflow
Length of output: 1006
Fix the langfuse observation finalization guard condition.
The cleanup path at line 771 uses "langfuse_generation" in locals() as its guard, even though the variable is referenced directly on lines 774–775. Inside a nested function, locals() only includes a name from the enclosing scope when the nested function references it as a free variable and the enclosing scope has already bound it, so the guard is fragile. Initialize the variable up front and test it directly instead:
Suggested fix
  langfuse_tracer = None
+ langfuse_generation = None
  trace_context = {}
@@
- if langfuse_tracer and "langfuse_generation" in locals():
+ if langfuse_generation is not None:
      langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
      langfuse_output = {"time_elapsed:": re.sub(r"\n", " \n", langfuse_output), "created_at": time.time()}
      langfuse_generation.update(output=langfuse_output)
      langfuse_generation.end()
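The difference between the two guard styles can be seen in a small standalone sketch. It is not the RagFlow code: `FakeObservation`, `fragile`, and `robust` are hypothetical names. It shows the case where the nested function checks the name only as a string; when a nested function also references the name directly, CPython does list it in locals() once it is bound.

```python
class FakeObservation:
    """Stand-in for the object returned by start_observation()."""

    def __init__(self) -> None:
        self.ended = False

    def end(self) -> None:
        self.ended = True


def fragile(enable_tracing: bool) -> bool:
    """Guard by string lookup: the nested function never names the variable."""
    if enable_tracing:
        generation = FakeObservation()  # noqa: F841 (only looked up by string)

    def cleanup() -> bool:
        # A string constant does not create a closure, so this function has
        # no locals and no free variables; the lookup cannot find the name.
        return "generation" in locals()

    return cleanup()


def robust(enable_tracing: bool) -> bool:
    """Guard by value: the name is always bound, and None means 'not started'."""
    generation = None
    if enable_tracing:
        generation = FakeObservation()

    def cleanup() -> bool:
        if generation is not None:
            generation.end()
            return True
        return False

    return cleanup()
```

With the `None` sentinel, the guard no longer depends on how the nested function happens to reference the variable.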
f82a1d7 to 57f2c6f
♻️ Duplicate comments (1)

api/db/services/dialog_service.py (1)

779-783: ⚠️ Potential issue | 🟠 Major

Finalize guard still prevents observation close in `decorate_answer`.

`langfuse_generation` is created here, but Line [771] checks `"langfuse_generation" in locals()` inside the nested function. That can skip `.update()`/`.end()` even when the observation exists, leaving traces incomplete.

Suggested fix
- langfuse_tracer = None
+ langfuse_tracer = None
+ langfuse_generation = None
  trace_context = {}
@@
- if langfuse_tracer and "langfuse_generation" in locals():
+ if langfuse_generation is not None:
      langfuse_output = "\n" + re.sub(r"^.*?(### Query:.*)", r"\1", prompt, flags=re.DOTALL)
      langfuse_output = {"time_elapsed:": re.sub(r"\n", " \n", langfuse_output), "created_at": time.time()}
      langfuse_generation.update(output=langfuse_output)
      langfuse_generation.end()
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@api/db/services/dialog_service.py`:
- Around line 779-783: The guard using "langfuse_generation" in locals() inside
decorate_answer can miss an existing observation and skip .update()/.end(); to
fix, initialize langfuse_generation = None in the outer scope before the if
langfuse_tracer block so the name is always defined, then in the nested
cleanup/close function check "if langfuse_generation is not None" (or truthy)
and call langfuse_generation.update(...) / langfuse_generation.end(); if the
nested function needs to assign to langfuse_generation, add a nonlocal
langfuse_generation declaration to allow modifying the outer variable.
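The `nonlocal` remark matters only when a nested function assigns to the outer variable. A minimal sketch of that case, with hypothetical names (`run_chat`, `start`, `finish`) and a plain dict standing in for the observation object:

```python
from typing import Optional


def run_chat(enable_tracing: bool) -> list:
    """Return the names of the observations that were closed."""
    observation: Optional[dict] = None  # always bound in the outer scope
    closed: list = []

    def start() -> None:
        # Without nonlocal, this assignment would create a new local
        # variable and leave the outer `observation` as None.
        nonlocal observation
        observation = {"name": "chat"}

    def finish() -> None:
        # Reading the outer variable needs no declaration.
        if observation is not None:
            closed.append(observation["name"])

    if enable_tracing:
        start()
    finish()
    return closed
```

Reading needs no declaration, which is why the suggested fix for `decorate_answer` only adds the `None` initialization unless the nested function also rebinds the name.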
Lynn-Inf left a comment
Thanks for the PR! I agree with the changes.
One additional thing: besides updating the code to work with the v4 SDK, please also update the version constraints in pyproject.toml to match the changes in uv.lock. I see that uv.lock has already been updated to 4.0.1, so please update pyproject.toml accordingly as well.
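One way to catch a mismatch between the declared constraint and the locked version is a small startup check. This is a sketch under assumptions: the minimum of 4.0.1 mirrors the `uv.lock` version mentioned above, the actual constraint to write in `pyproject.toml` is left to the maintainers, and `parse_version` and `satisfies_minimum` are hypothetical helpers that only handle plain numeric release versions such as `4.0.1`.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Split a plain numeric version such as '4.0.1' into comparable integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed: str, minimum: str = "4.0.1") -> bool:
    """Return True when `installed` is at least `minimum` (assumed bound)."""
    return parse_version(installed) >= parse_version(minimum)
```

In an application, the installed version string could be read with `importlib.metadata.version("langfuse")` before being passed to `satisfies_minimum`.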
@RazmikGevorgyan, merging this PR could unblock me and many others 🙏🏿

pushed fix, pls merge if it looks ok